Markerless tracking of hands and fingers is a promising enabler for human-computer interaction. However, adoption has been limited because of tracking inaccuracies, incomplete coverage of motions, low frame rates, complex camera setups, and high computational requirements. In this paper, we present a fast method for accurately tracking rapid and complex articulations of the hand using a single depth camera. Our algorithm uses a novel detection-guided optimization strategy that increases the robustness and speed of pose estimation. In the detection step, a randomized decision forest classifies pixels into parts of the hand. In the optimization step, a novel objective function combines the detected part labels and a Gaussian mixture representation of the depth to estimate a pose that best fits the depth. Our approach needs comparatively fewer computational resources, which makes it extremely fast (50 fps without GPU support). The approach also supports varying static, or moving, camera-to-scene arrangements. We show the benefits of our method by evaluating on public datasets and comparing against previous work.